AITopics

Country:

Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
North America > Canada > British Columbia (0.04)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Neural Information Processing SystemsFeb-9-2026, 01:00:25 GMT

1d4c4047f5d82699cf01b19c991ec56d-Paper-Datasets_and_Benchmarks_Track.pdf

blindway, dataset, participant, (14 more...)

Country:

North America > United States > Maryland > Prince George's County > College Park (0.04)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
Africa > Mali (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine (1.00)
Information Technology (0.68)
Automobiles & Trucks (0.68)
Transportation > Ground > Road (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.94)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Artificial IntelligenceDec-9-2025

Switch-JustDance: Benchmarking Whole Body Motion Tracking Policies Using a Commercial Console Game

Kim, Jeonghwan, Kim, Wontaek, Lu, Yidan, Cheng, Jin, Zargarbashi, Fatemeh, Zeng, Zicheng, Qi, Zekun, Dou, Zhiyang, Sontakke, Nitish, Baek, Donghoon, Ha, Sehoon, Li, Tianyu

Recent advances in whole-body robot control have enabled humanoid and legged robots to perform increasingly agile and coordinated motions. However, standardized benchmarks for evaluating these capabilities in real-world settings, and in direct comparison to humans, remain scarce. Existing evaluations often rely on pre-collected human motion datasets or simulation-based experiments, which limit reproducibility, overlook hardware factors, and hinder fair human-robot comparisons. We present Switch-JustDance, a low-cost and reproducible benchmarking pipeline that leverages motion-sensing console games, Just Dance on the Nintendo Switch, to evaluate robot whole-body control. Using Just Dance on the Nintendo Switch as a representative platform, Switch-JustDance converts in-game choreography into robot-executable motions through streaming, motion reconstruction, and motion retargeting modules and enables users to evaluate controller performance through the game's built-in scoring system. We first validate the evaluation properties of Just Dance, analyzing its reliability, validity, sensitivity, and potential sources of bias. Our results show that the platform provides consistent and interpretable performance measures, making it a suitable tool for benchmarking embodied AI. Building on this foundation, we benchmark three state-of-the-art humanoid whole-body controllers on hardware and provide insights into their relative strengths and limitations.

artificial intelligence, controller, image understanding, (18 more...)

2511.17925

Genre: Research Report > New Finding (1.00)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots > Humanoid Robots (0.49)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.42)

Neural Information Processing SystemsNov-21-2025, 15:56:56 GMT

Self-supervised Learning of Motion Capture

Current state-of-the-art solutions for motion capture from a single camera are optimization driven: they optimize the parameters of a 3D human model so that its re-projection matches measurements in the video (e.g.

motion capture, name change, self-supervised learning, (6 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.86)

Hsiao-Yu Tung, Hsiao-Wei Tung, Ersin Yumer, Katerina Fragkiadaki

Self-supervised Learning of Motion Capture

Neural Information Processing SystemsNov-21-2025, 12:18:11 GMT

artificial intelligence, machine learning, segmentation, (18 more...)

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Ohio > Franklin County > Columbus (0.04)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.04)
(2 more...)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Vision > Video Understanding (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.83)

Kasap, Deniz, Najafi, Taraneh Aminosharieh, Thevenot, Jérôme Paul Rémy, Dan, Jonathan, Albini, Stefano, Atienza, David

VersaPants: A Loose-Fitting Textile Capacitive Sensing System for Lower-Body Motion Capture

arXiv.org Artificial IntelligenceNov-21-2025

We present VersaPants, the first loose-fitting, textile-based capacitive sensing system for lower-body motion capture, built on the open-hardware VersaSens platform. By integrating conductive textile patches and a compact acquisition unit into a pair of pants, the system reconstructs lower-body pose without compromising comfort. Unlike IMU-based systems that require user-specific fitting or camera-based methods that compromise privacy, our approach operates without fitting adjustments and preserves user privacy. VersaPants is a custom-designed smart garment featuring 6 capacitive channels per leg. We employ a lightweight Transformer-based deep learning model that maps capacitance signals to joint angles, enabling embedded implementation on edge platforms. To test our system, we collected approximately 3.7 hours of motion data from 11 participants performing 16 daily and exercise-based movements. The model achieves a mean per-joint position error (MPJPE) of 11.96 cm and a mean per-joint angle error (MPJAE) of 12.3 degrees across the hip, knee, and ankle joints, indicating the model's ability to generalize to unseen users and movements. A comparative analysis of existing textile-based deep learning architectures reveals that our model achieves competitive reconstruction performance with up to 22 times fewer parameters and 18 times fewer FLOPs, enabling real-time inference at 42 FPS on a commercial smartwatch without quantization. These results position VersaPants as a promising step toward scalable, comfortable, and embedded motion-capture solutions for fitness, healthcare, and wellbeing applications.

artificial intelligence, deep learning, machine learning, (16 more...)

2511.16346

Country:

North America > United States (1.00)
Europe (1.00)

Genre: Research Report > New Finding (0.46)

Industry:

Materials > Chemicals (0.67)
Health & Medicine > Consumer Health (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceOct-29-2025

How Pragmatics Shape Articulation: A Computational Case Study in STEM ASL Discourse

Imai, Saki, Kezar, Lee, Aichler, Laurel, Inan, Mert, Walker, Erin, Wooten, Alicia, Quandt, Lorna, Alikhani, Malihe

Most state-of-the-art sign language models are trained on interpreter or isolated vocabulary data, which overlooks the variability that characterizes natural dialogue. However, human communication dynamically adapts to contexts and interlocutors through spatiotemporal changes and articulation style. This specifically manifests itself in educational settings, where novel vocabularies are used by teachers, and students. To address this gap, we collect a motion capture dataset of American Sign Language (ASL) STEM (Science, Technology, Engineering, and Mathematics) dialogue that enables quantitative comparison between dyadic interactive signing, solo signed lecture, and interpreted articles. Using continuous kinematic features, we disentangle dialogue-specific entrainment from individual effort reduction and show spatiotemporal changes across repeated mentions of STEM terms. On average, dialogue signs are 24.6%-44.6% shorter in duration than the isolated signs, and show significant reductions absent in monologue contexts. Finally, we evaluate sign embedding models on their ability to recognize STEM signs and approximate how entrained the participants become over time. Our study bridges linguistic analysis and computational modeling to understand how pragmatics shape sign articulation and its representation in sign language technologies.

artificial intelligence, machine learning, natural language, (19 more...)

2510.23842

Country:

North America > United States > New York (0.28)
North America > United States > Minnesota (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Education > Curriculum > Subject-Specific Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Artificial IntelligenceOct-27-2025

Group Inertial Poser: Multi-Person Pose and Global Translation from Sparse Inertial Sensors and Ultra-Wideband Ranging

Xue, Ying, Jiang, Jiaxi, Armani, Rayan, Hollidt, Dominik, Liao, Yi-Chi, Holz, Christian

Tracking human full-body motion using sparse wearable inertial measurement units (IMUs) overcomes the limitations of occlusion and instrumentation of the environment inherent in vision-based approaches. However, purely IMU-based tracking compromises translation estimates and accurate relative positioning between individuals, as inertial cues are inherently self-referential and provide no direct spatial reference for others. In this paper, we present a novel approach for robustly estimating body poses and global translation for multiple individuals by leveraging the distances between sparse wearable sensors - both on each individual and across multiple individuals. Our method Group Inertial Poser estimates these absolute distances between pairs of sensors from ultra-wideband ranging (UWB) and fuses them with inertial observations as input into structured state-space models to integrate temporal motion patterns for precise 3D pose estimation. Our novel two-step optimization further leverages the estimated distances for accurately tracking people's global trajectories through the world. We also introduce GIP-DB, the first IMU+UWB dataset for two-person tracking, which comprises 200 minutes of motion recordings from 14 participants. In our evaluation, Group Inertial Poser outperforms previous state-of-the-art methods in accuracy and robustness across synthetic and real-world data, showing the promise of IMU+UWB-based multi-human motion capture in the wild. Code, models, dataset: https://github.com/eth-siplab/GroupInertialPoser

artificial intelligence, estimation, machine learning, (17 more...)